Local scheduling techniques for memory coherence in a clustered VLIW processor with a distributed data cache

Abstract

Clustering is a common technique for dealing with wire delays. Fully distributed architectures, in which the register file, the functional units and the cache memory are all partitioned, are particularly effective under these constraints and are also very scalable. However, distributing the data cache introduces a new problem: memory instructions may reach the cache in an order different from the sequential program order, which may corrupt its contents. This paper proposes and evaluates two local scheduling mechanisms that guarantee the serialization of aliased memory instructions: the construction of memory dependent chains (the MDC solution), and two transformations, store replication and load-store synchronization, applied to the original data dependence graph (the DDGT solution). Neither solution requires any extra hardware. The proposed scheduling techniques are evaluated for a clustered VLIW processor with a word-interleaved cache, although they can also be used with any other distributed cache configuration. Results for the Mediabench benchmark suite demonstrate the effectiveness of these techniques. In particular, the DDGT solution increases the proportion of local accesses by 16% compared to MDC and reduces stall time by 32%, since load instructions can be freely scheduled in any cluster. However, the MDC solution reduces compute time and often outperforms DDGT. Finally, the impact of both techniques on an architecture with attraction buffers is studied and evaluated.
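The two ideas named in the abstract, a word-interleaved cache and memory dependent chains, can be illustrated with a minimal Python sketch. It is not the paper's algorithm. The word size, the cluster count, the helper names (`home_cluster`, `build_chains`, `assign_clusters`) and the round-robin placement are all assumptions made for illustration. The sketch also assumes that a chain groups operations that may alias, and that keeping a whole chain in one cluster is what preserves program order among them.

```python
from collections import defaultdict

WORD_BYTES = 4    # assumed word size (not given in the abstract)
NUM_CLUSTERS = 4  # assumed number of clusters (not given in the abstract)


def home_cluster(address: int) -> int:
    """Cluster whose cache module holds `address` when consecutive
    words are interleaved across the clusters."""
    return (address // WORD_BYTES) % NUM_CLUSTERS


def build_chains(num_ops: int, alias_pairs):
    """Group memory operations into chains with union-find:
    two operations joined by a may-alias pair end up in the same chain."""
    parent = list(range(num_ops))

    def find(x: int) -> int:
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in alias_pairs:
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[max(ra, rb)] = min(ra, rb)

    chains = defaultdict(list)
    for op in range(num_ops):
        chains[find(op)].append(op)
    return sorted(chains.values())


def assign_clusters(chains):
    """Place every operation of a chain in a single cluster.
    Round-robin over chains is an arbitrary illustrative policy."""
    placement = {}
    for i, chain in enumerate(chains):
        for op in chain:
            placement[op] = i % NUM_CLUSTERS
    return placement


# Hypothetical input: five memory operations; 0, 2 and 4 may alias.
chains = build_chains(5, [(0, 2), (2, 4)])
placement = assign_clusters(chains)
```

With this hypothetical input, operations 0, 2 and 4 form one chain and share a cluster, while operations 1 and 3 can be placed independently. The trade-off the abstract reports is visible even here: a chained load may be placed away from the cluster given by `home_cluster` for its address, whereas the DDGT solution lets loads be scheduled freely in any cluster.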
